Recent work draws attention to community-community encounters (“coalescence”) as likely an 
important factor shaping natural ecosystems. This work builds on MacArthur’s classic model of 
competitive coexistence to investigate such community-level competition in a minimal theoretical 
setting. It is shown that the ability of a species to survive a coalescence event is best predicted 
by a community-level “fitness” of its native community rather than the intrinsic performance of 
the species itself. The model presented here allows formalizing a macroscopic perspective whereby 
a community harboring organisms at varying abundances becomes equivalent to a single organism 
expressing genes at different levels. While most natural communities do not satisfy the strict criteria 
of multicellularity developed by multi-level selection theory, the effective cohesion described here is 
a generic consequence of division of labor, requires no cooperative interactions, and can be expected 
to be widespread in microbial ecosystems. 


Over the last decade, the sequencing-driven revolution in microbial ecology unveiled the
staggering complexity of microbial communities that shape the health of our planet, and our
own. These ecosystems routinely harbor hundreds of species of microorganisms, the vast
majority of which remain poorly characterized. This makes the bottom-up approach to their
modeling extremely challenging, prompting the question of whether some effective, top-down
theory of the community as a whole might be a more viable alternative.

The need for a top-down approach is highlighted by multiple experimental observations. The
microscopic species-level composition of independently assembled communities is highly
variable even in similar environments; in contrast, the community metagenome (pathways
carried by the population as a whole) appears to be more stable [5]. Studies of obesity or
inflammatory bowel disease indicate that these conditions are unlikely to be caused by
specific “pathogenic species”; similarly, the healthy human microbiota exhibits no core set
of “healthy” microorganisms [3]. Thus, the “healthy” and “diseased” states of
human-associated microbiota appear to be community-level phenotypic labels that may not
always be traceable to specific community members.

Remarkably, the behavior of such macroscopically defined states can be productively studied
even as the microscopic details remain unclear: thus, studies report on “lean microbiota”
outcompeting “obese microbiota” in mice, or on the efficacy of fecal matter transplant in
treating C. difficile infections, whereby a “healthy” community overtakes the “diseased”
state [13]. Both examples can be conceptualized as community-level competition events,
termed “community coalescence”. Although poorly understood, such events are widespread in
natural microbial ecosystems and likely play a major role in shaping their structure.
Intriguingly, Rillig et al. argue that coalescing communities often appear to be
“interacting as internally integrated units rather than as a collection of species that
suddenly interact with another collection of species”.

Although comparing a community to a functionally integrated “superorganism” is a recurring
metaphor, a well-established body of theory cautions against using such terms loosely. The
formal criteria under which a group of organisms can be considered a “multicellular whole”
have been extensively discussed in the context of multi-level selection theory (MLS) [18].
At the very least, the established notions of group-level individuality and “organismality”
crucially rely on cooperative traits of group members. As a result, the formal applicability
of the “superorganism” perspective appears to be severely restricted, as pervasive
cooperation between members must first be demonstrated. In particular, the microbiota
inhabiting the human gut is extremely unlikely to satisfy such criteria.

However, the utility of a macroscopic community-level perspective, and its ability to
predict the outcome of competition between communities, need not hinge on whether they
constitute a valid level of selection in the strict sense of MLS. It is well known that
performance of a species is dependent on community context: for example, niche-packed
communities are more resistant to invasion [27]. Building on these ideas, the present work
extends the classical model of MacArthur [22] to construct a simple adaptive dynamics
framework that describes co-evolution in multi-species communities and allows investigating
the phenomenon of “community coalescence” in a minimal theoretical setting. The central
result is a mathematically precise analogy established between a community whose members
can change in abundance and an individual organism whose pathways can modulate in
expression. This analogy concerns the manner in which a community interacts with its
environment and with other communities; it does not investigate reproduction, and so does
not constitute multicellularity in the established sense of the term [18]. Rather than being
a limitation, this expands the potential applicability of the top-down perspective advocated
here. While the criteria of “true multicellularity” are too stringent to apply to most
natural communities, the phenomenon described in this work is a generic consequence of
ecological interactions in a diverse ecosystem and requires no cooperative behavior or
“altruism”.


I. METHODS: THE METAGENOME PARTITIONING MODEL

FIG. 1. The metagenome partitioning model. Organisms are defined by the pathways they
carry, the benefit from each substrate is equally partitioned among all organisms who can
use it, and population growth/death of each species is determined by the resource surplus
it experiences.


To investigate community coalescence in the simplest theoretical setting, consider the
following model for division of labor in large communities. It is closely related to
MacArthur’s model of competitive coexistence on multiple resources [23]; see Supplementary
Material (SM).

Consider a community in a habitat where a single limiting resource exists in $N$ forms
(“substrates” $i \in \{1 \ldots N\}$) denoted A, B, etc. For example, this could be
carbon-limited growth in an environment with $N$ carbon sources, or a community limited by
availability of electron acceptors in an environment with $N$ oxidants. The substrates can
be utilized with “pathways” $P_i$ (one specialized pathway per substrate). A species is
defined by the pathways that it carries (similar, for example, to the approach of
Ref. [30]). There is a total of $2^N - 1$ possible species in this model; they will be
denoted using a binary vector of pathway presence/absence, $\vec\sigma = \{1, 1, 0, 1, \ldots\}$,
or by a string listing all substrates they can use, e.g. “species $\underline{ABD}$” (the
underline distinguishes specialist organisms such as $\underline{A}$ from the substrate they
consume, in this case A). Let $n_{\vec\sigma}$ be the total abundance of species
$\vec\sigma$ in the community, and let $T_i$ be the total number of individuals capable of
utilizing substrate $i$ (Fig. 1):

$$T_i = \sum_{\vec\sigma} n_{\vec\sigma}\,\sigma_i.$$

Assume a well-mixed environment, so that each of these $T_i$ individuals gets an equal
share $R_i/T_i$ of the total benefit $R_i$ (carbon content, oxidation power, etc.)
available from substrate $i$ (“scramble competition”). Any one substrate is capable of
sustaining growth, but accessing several substrates accumulates the benefits. The population
growth/death rate of species $\vec\sigma$ will be determined by the resource surplus
$\Delta_{\vec\sigma}$ experienced by each of its individuals:

$$\Delta_{\vec\sigma} = \sum_i \sigma_i \frac{R_i}{T_i} - \chi_{\vec\sigma}. \qquad (1)$$

Here the first term is the benefit harvested by all carried pathways, and the second
represents the maintenance costs of organism $\vec\sigma$. These costs summarize all the
biochemistry that makes different species more or less efficient at processing their
resources. For simplicity, let these costs be random:

$$\chi_{\vec\sigma} = \chi_0\,|\vec\sigma|\,(1 + \epsilon\,\xi_{\vec\sigma}). \qquad (2)$$

Here $\chi_0$ is a constant (the average cost per pathway), $\xi_{\vec\sigma}$ is a random
variable chosen once for each species and drawn out of the standard normal distribution
(truncated to ensure $\chi_{\vec\sigma} > 0$), $\epsilon$ sets the magnitude of cost
fluctuations, and $|\vec\sigma| = \sum_i \sigma_i$ is the number of pathways carried by the
species. This factor ensures that expressing more pathways incurs a higher cost (in this
simple model, carrying and expressing a pathway is synonymous).

The resource surplus $\Delta_{\vec\sigma}$ is used to generate biomass. The simplest
approach is to equate the biomass of an organism with its cost, so that the total biomass
of a species is $\chi_{\vec\sigma} n_{\vec\sigma}$, and the dynamics of the model are given
by:

$$\tau_0\,\chi_{\vec\sigma}\,\frac{dn_{\vec\sigma}}{dt} = g_{\vec\sigma}(\{n_{\vec\sigma}\}) \equiv n_{\vec\sigma}\,\Delta_{\vec\sigma}. \qquad (3)$$

The constants $\chi_0$ and $\tau_0$ set the units of resource and time.

The approach taken here purposefully ignores multiple factors, most notably trophic
interactions or any other form of cross-organism dependence. This is intentional: it
ensures that the interaction matrix

$$M_{ab} \equiv \frac{\partial \Delta_{\vec\sigma_a}}{\partial n_{\vec\sigma_b}} = -\sum_i \sigma_{a,i}\,\sigma_{b,i}\,\frac{R_i}{T_i^2}$$

has no positive terms, i.e. the setting is purely competitive (indices $a$, $b$ label
species). This helps underline that the whole-community behavior exhibited below is a
generic consequence of division of labor, and requires no explicitly cooperative
interactions.

Other simplifications include the assumption of deterministic dynamics and a well-mixed
environment. Although stochasticity and spatial structure are tremendously important in
most contexts, the simplified model adopted here provides a convenient starting point and
makes the problem tractable analytically.
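The ingredients of the model can be written down in a few lines. The following is a minimal Python sketch, not the MATLAB scripts used for the figures; the function names (`all_species`, `draw_costs`, `surplus`, `growth`) are hypothetical, the truncation of the normal distribution is implemented by simple resampling, and substrates that nobody can use are assigned zero harvest as a numerical convenience.

```python
import itertools
import numpy as np

def all_species(N):
    """All 2^N - 1 non-empty binary pathway vectors, one row per species."""
    rows = [s for s in itertools.product((0, 1), repeat=N) if any(s)]
    return np.array(rows, dtype=float)

def draw_costs(sigma, chi0=1.0, eps=1e-3, rng=None):
    """Random costs chi = chi0 * |sigma| * (1 + eps * xi), xi ~ standard normal.
    Resampling until every cost is positive stands in for the truncation."""
    rng = np.random.default_rng() if rng is None else rng
    xi = rng.standard_normal(len(sigma))
    bad = 1.0 + eps * xi <= 0.0
    while bad.any():
        xi[bad] = rng.standard_normal(int(bad.sum()))
        bad = 1.0 + eps * xi <= 0.0
    return chi0 * sigma.sum(axis=1) * (1.0 + eps * xi)

def surplus(n, sigma, chi, R):
    """Resource surplus per individual: sum_i sigma_i R_i / T_i - chi."""
    R = np.asarray(R, dtype=float)
    T = n @ sigma                      # T_i = sum over species of n * sigma_i
    # Assumption: an unused substrate (T_i = 0) contributes zero harvest.
    H = np.divide(R, T, out=np.zeros_like(R), where=T > 0)
    return sigma @ H - chi

def growth(n, sigma, chi, R, tau0=1.0):
    """Right-hand side dn/dt implied by tau0 * chi * dn/dt = n * Delta."""
    return n * surplus(n, sigma, chi, R) / (tau0 * chi)
```

For `N = 10` the enumeration yields the 1023 species mentioned in the text; larger `N` would call for sampling species rather than enumerating them.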

This work will investigate coalescence of communities that originate and remain in similar
environments, e.g., transfer of oral communities by kissing [31] as opposed to invasion of
microbes from the mouth into the gut [32]. Imagine a collection of islands (or patches)
labeled by $\alpha$, each harboring a community $C_\alpha$ experiencing the same
environment $\mathcal{E}$. The next section investigates the within-island dynamics (3) to
establish some key properties that make this simplified model particularly convenient for
our purposes. Specifically, let $\Omega(C)$ denote the set of species present at non-zero
abundance in a community $C$. It will be shown that under the dynamics (3), any community
$C$ will eventually converge to a stable equilibrium $C^*$ uniquely determined by the set
$\Omega(C)$. Here and below, the starred quantities refer to equilibrium of ecological
dynamics. At this equilibrium, certain species $S^* = \Omega(C^*)$ establish at a non-zero
abundance, while others “go extinct”, exponentially decreasing towards zero. Importantly,
the set of survivors will depend only on the identity of the initially present species, and
not on their initial abundance. Thus a community $C_1$ coalescing with $C_2$ will yield the
same community $C_{12}$ irrespective of the initial mixing ratios. While obviously a
simplification, this makes the metagenome partitioning model an especially convenient
starting point to build theoretical intuition about community-community interactions before
more general situations can be studied, e.g. numerically.
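The independence from initial abundances can be illustrated on a toy case. The sketch below is an assumption-laden example rather than the procedure used in this work: it takes two substrates with supply R = (2, 1), three species (specialists on each substrate with cost 1, and a generalist with cost 1.8), sets τ0 = 1, and integrates the dynamics with a multiplicative Euler step chosen here for simplicity. Solving Δ = 0 by hand for the surviving pair gives abundances 0.75 and 1.25, with the second specialist going extinct.

```python
import numpy as np

# Toy system (all numbers are illustrative assumptions).
# Rows: specialist on substrate 1, specialist on substrate 0, generalist.
SIGMA = np.array([[0, 1], [1, 0], [1, 1]], dtype=float)
CHI = np.array([1.0, 1.0, 1.8])   # maintenance costs
R = np.array([2.0, 1.0])          # substrate supply

def equilibrate(n0, dt=0.05, steps=8000):
    """Integrate chi * dn/dt = n * Delta (tau0 = 1) with a multiplicative
    Euler step, which keeps abundances non-negative."""
    n = np.array(n0, dtype=float)
    for _ in range(steps):
        T = n @ SIGMA                      # community-wide pathway expression
        delta = SIGMA @ (R / T) - CHI      # resource surplus per individual
        n = n * np.exp(dt * delta / CHI)
    return n

a = equilibrate([1.0, 1.0, 1.0])
b = equilibrate([5.0, 0.01, 0.2])
print(np.round(a, 6), np.round(b, 6))
```

Both starting points end at the same state: the specialist on the scarcer substrate decays exponentially, while the other two species settle at the values computed by hand.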


These properties are established in the next section; 
the following section then turns to the main subject of 
this work, namely coalescence events between islands. 



[Fig. 2, panel A axis label: Abundance]

FIG. 2. The individual performance rank of a species (its cost per pathway) is predictive
of its survival and abundance in a community. A: Community equilibrium for one particular
random realization of the model ($N = 10$, $\epsilon = 10^{-3}$). Species are ordered by
abundance and labeled by the pathways they carry. Also indicated is the individual
performance rank; all surviving species were within the top 30 (out of 1023). B: The median
individual performance rank of survivors, weighted (dashed) or not weighted (solid) by
abundance. Curves show mean over 100 random communities for each $\epsilon$; the standard
deviation across 100 instances is stable at approximately 40% of the mean for both curves,
independently of $\epsilon$ (not shown to reduce clutter).


II. SINGLE-ISLAND ADAPTIVE DYNAMICS: INTRINSIC SPECIES PERFORMANCE AND A COMMUNITY-LEVEL OBJECTIVE FUNCTION

Numerical simulation of the competition between all 1023 possible species, initialized at
equal abundance, for $N = 10$, $\epsilon = 10^{-3}$, $R_i = 100\chi_0$, and one random
realization of organism costs results in an equilibrium state depicted in Fig. 2A
(footnote 1). In this example it consists of 9 species. It is natural to ask: for a given
initial set of competitors, what determines the species that survive?

In the present model, the only intrinsic performance characteristic of a species is its
cost per pathway. Consider an assay whereby a single individual of species $\vec\sigma$ is
placed in an environment with no other organisms present, and, for simplicity, all
substrates are supplied in equal abundance $R_i = R$. The initial population growth rate in
this chemostat is given by:

$$\frac{1}{n_{\vec\sigma}}\frac{dn_{\vec\sigma}}{dt} = \frac{1}{\tau_0}\left(\frac{R\,|\vec\sigma|}{\chi_{\vec\sigma}} - 1\right),$$

and abundance eventually equilibrates at $n_{\vec\sigma} = R|\vec\sigma|/\chi_{\vec\sigma}$.
Both these quantities characterize performance of species $\vec\sigma$ (the term “fitness”
is avoided as it is a micro-evolutionary concept that, strictly speaking, should be defined
only within individuals of one species). Define the “individual” performance measure of
species $\vec\sigma$ as

$$f_{\vec\sigma} \equiv \frac{\chi_0\,|\vec\sigma|}{\chi_{\vec\sigma}} - 1. \qquad (4)$$

This definition is convenient as it makes $f_{\vec\sigma}$ a dimensionless quantity of
order $\epsilon$. Under the cost model (2), the performance ranking of species is random,
set by the random realization of the costs $\xi$.
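As a small illustration of this ranking, the sketch below computes the individual performance as χ0|σ|/χ − 1 (the form consistent with the generalist result ln(1 + f) derived later in the text) and orders species from best to worst. The helper names and the three-species example are hypothetical.

```python
import numpy as np

def performance(sigma, chi, chi0=1.0):
    """Individual performance f = chi0 * |sigma| / chi - 1 for each species."""
    return chi0 * sigma.sum(axis=1) / chi - 1.0

def performance_rank_order(f):
    """Indices of species sorted from highest to lowest performance."""
    return np.argsort(-f, kind="stable")

# Hypothetical example: two specialists at average cost, one cheap generalist.
sigma = np.array([[0, 1], [1, 0], [1, 1]], dtype=float)
chi = np.array([1.0, 1.0, 1.8])
f = performance(sigma, chi)
order = performance_rank_order(f)
```

With costs drawn as χ0|σ|(1 + εξ), this gives f = 1/(1 + εξ) − 1 ≈ −εξ, so the ranking is simply the ranking of the random draws ξ.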

Predictably, this performance ranking is correlated with the success of a species in a
community, but not very well (Fig. 2). The equilibrium depicted in panel A predominantly
consists of top-ranked species, and the median performance rank of surviving species is
consistently low across a range of $\epsilon$ (panel B). This median rank becomes even
lower if the median is weighted by a species’ abundance at equilibrium, indicating that
top-ranked species tend to be present at higher abundance [22, 33]. Still, at the
equilibrium shown in Fig. 2A, the species ranked 4th in intrinsic performance went extinct,
but 6 others ranked as low as #29 remained present.

Footnote 1: MATLAB scripts (MATLAB, Inc.) performing simulations and reproducing Figs. 2-5
are available upon request.

These observations reflect the well-known fact that the success of a species is
context-dependent and observing a species in isolation does not measure its performance in
the relevant environment [23, 24]. For example, consider the three-substrate world depicted
in Fig. 1 and assume that $\underline{AB}$ is the highest-performing species with a very low
cost. As $\underline{AB}$ multiplies, it depletes resources A and B (in the sense that the
benefit $R_i/T_i$ any organism can harvest from them is reduced). As a result, the final
equilibrium is highly likely to include the specialist organism $\underline{C}$, even if its
cost is relatively high, and under other circumstances (if $\underline{AB}$ were less fit)
it would have yielded to $\underline{AC}$ or $\underline{BC}$. Conveniently, in the model
described here, these complex effects studied by niche construction theory can be
summarized in a single community-level objective function. The context experienced by all
species is fully encoded in the vector of “harvests” $H_i = R_i/T_i$ available from each
substrate, and the dynamics (3) possess a Lyapunov function (compare to MacArthur 1969):

$$F = \frac{1}{R_{\rm tot}}\left(\sum_i R_i \ln\frac{T_i}{R_i/\chi_0} - \sum_{\vec\sigma} \chi_{\vec\sigma} n_{\vec\sigma} + R_{\rm tot}\right). \qquad (5)$$

Here $R_{\rm tot}$ is a constant introduced for later convenience. Specifically, set
$R_{\rm tot} = \sum_i R_i$; this choice ensures that close to community equilibrium, $F$ is
also of order $\epsilon$ (see SM). This function, defined for $n_{\vec\sigma} \geq 0$ and
$T_i > 0$, has the property that
$R_{\rm tot}\,\partial F/\partial n_{\vec\sigma} = \Delta_{\vec\sigma}$, and therefore

$$\frac{dF}{dt} = \sum_{\vec\sigma} \frac{\partial F}{\partial n_{\vec\sigma}}\,\frac{dn_{\vec\sigma}}{dt} = \frac{1}{R_{\rm tot}} \sum_{\vec\sigma} \Delta_{\vec\sigma}\,\frac{dn_{\vec\sigma}}{dt} \geq 0,$$

since by (3) each $dn_{\vec\sigma}/dt$ has the same sign as $\Delta_{\vec\sigma}$.
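The gradient property of the objective function is easy to check numerically. The sketch below is only an illustration under stated assumptions: F is taken in the form (1/R_tot)(Σ R_i ln(T_i χ0/R_i) − Σ χ n + R_tot), the three-species system and its parameters are invented for the example, and the names `objective`, `surplus` and `numerical_gradient` are hypothetical.

```python
import numpy as np

SIGMA = np.array([[0, 1], [1, 0], [1, 1]], dtype=float)  # toy species set
CHI = np.array([1.0, 1.0, 1.8])                          # toy costs
R = np.array([2.0, 1.0])                                 # toy supply

def objective(n, chi0=1.0):
    """Community-level function F for abundances n (requires all T_i > 0)."""
    T = n @ SIGMA
    R_tot = R.sum()
    return (np.sum(R * np.log(T / (R / chi0))) - CHI @ n + R_tot) / R_tot

def surplus(n):
    """Resource surplus Delta for every species."""
    return SIGMA @ (R / (n @ SIGMA)) - CHI

def numerical_gradient(n, h=1e-6):
    """Central finite-difference estimate of dF/dn."""
    grad = np.zeros_like(n)
    for k in range(len(n)):
        step = np.zeros_like(n)
        step[k] = h
        grad[k] = (objective(n + step) - objective(n - step)) / (2 * h)
    return grad

n = np.array([1.0, 1.0, 1.0])
grad = numerical_gradient(n)             # should match surplus(n) / R_tot
n_next = n * np.exp(0.05 * surplus(n) / CHI)   # one small step of the dynamics
```

Comparing `grad` with `surplus(n) / R.sum()` confirms the relation between F and Δ, and `objective(n_next)` exceeds `objective(n)`, as expected for a Lyapunov function.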
Thus $F$ is monotonically increasing as the system is converging to equilibrium. To
illustrate this, Fig. 3 shows 10 trajectories of ecological dynamics for the same system as
in Fig. 2, starting from random initial conditions (with all species present; see SM). Far
from equilibrium, while most high-cost species are being eliminated by competitors, the
mean intrinsic performance of surviving organisms and $F$ increase together (Fig. 3,
inset), confirming that intrinsic performance is a useful predictor. However, as
equilibrium is approached, community-induced changes in substrate availability $H_i$ reduce
the relevance of the original performance ranking, which was measured in the “wrong”
environment. The performance rank ordering will be all the more sensitive to the
environment $H_i$, the smaller the scatter of intrinsic organism costs $\epsilon$.
Therefore, the role of this parameter is to tune the relative magnitude of intrinsic and
environment-dependent factors in determining a species’ fate. So far, $\epsilon$ was fixed
at $10^{-3} \approx 2^{-N}$, and Fig. 2B shows that for small $\epsilon$, the structure of
the final equilibria does not significantly depend on this parameter (see SM). The
large-$\epsilon$ regime will be discussed later.

Each of the trajectories in Fig. 3 converges to the same equilibrium (depicted in Fig. 2A).
This is because $F$ is concave and bounded from above (see SM). Therefore, for every set of
species $\Omega$, any community restricted to these species will always reach the same
(stable) equilibrium, corresponding to the unique maximum of $F$ on the subspace
$V_\Omega$ defined by the conditions $\{n_{\vec\sigma} = 0$ for all
$\vec\sigma \notin \Omega\}$. This maximum will often be at the border of this subspace,
corresponding to the extinction of some species.

[Fig. 3 panel title: Convergence to community equilibrium; axis label: $\langle f \rangle / \epsilon$ (mean intrinsic performance)]

FIG. 3. Community dynamics maximize a global objective function $F$. 10 trajectories of
ecological dynamics for an example system, starting from random initial conditions and
converging to the equilibrium depicted in Fig. 2A. Inset: a zoomed-out version of the same
plot; data aspect ratio as in the main panel. Mean intrinsic performance of community
members is weighted by their abundance. Direction of dynamics indicated by arrows.

Under the dynamics (3), no new species can “appear” if their original abundance was zero.
Imagine, however, that on each island, a rare mutation (or migration) occasionally
introduces a random new species; if it can invade, the community transitions to a new
equilibrium and awaits a new mutation. This process of adaptive dynamics defines the
evolution of each island, and can be seen as a mesoscopic population genetics model for a
multi-species community evolving through horizontal gene transfer (loss/acquisition of
whole pathways). For each island, $F$ is monotonically increasing throughout its evolution.
Indeed, $F$ is continuous and non-singular in all $n_{\vec\sigma}$, so introducing an
invader at a vanishingly small abundance will leave $F$ unchanged, and the subsequent
convergence to a new equilibrium is a valid trajectory of ecological dynamics on which $F$
increases. Importantly, at any equilibrium,

$$\sum_{\vec\sigma} \chi_{\vec\sigma}\,n^*_{\vec\sigma} = \sum_i R_i = R_{\rm tot}$$

(see SM), and so the value of $F$ at community equilibrium is a quantity that depends only
on macroscopic quantities, namely the community-wide pathway expression $T_i$. The
following sections will argue that $F$ can be thought of as community-level “fitness”, but
this term will not be used until justification is provided.
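The claim that the equilibrium value of F depends only on the community-wide expression levels can be verified directly from the definitions. The short derivation below is a sketch that assumes the surplus has the form Δ = Σ_i σ_i R_i/T_i − χ and that F has the logarithmic form with R_tot = Σ_i R_i; at equilibrium every species either is absent or has zero surplus.

```latex
\begin{align*}
n^*_{\vec\sigma}\,\Delta_{\vec\sigma} = 0 \ \text{for all } \vec\sigma
\quad\Rightarrow\quad
\sum_{\vec\sigma} \chi_{\vec\sigma}\, n^*_{\vec\sigma}
  &= \sum_{\vec\sigma} n^*_{\vec\sigma} \sum_i \sigma_i \frac{R_i}{T^*_i}
   = \sum_i \frac{R_i}{T^*_i} \sum_{\vec\sigma} n^*_{\vec\sigma}\,\sigma_i
   = \sum_i R_i = R_{\rm tot}, \\
F^* &= \frac{1}{R_{\rm tot}} \sum_i R_i \ln \frac{T^*_i}{R_i/\chi_0}.
\end{align*}
```

The cost term cancels against the constant, leaving an expression in which species-level abundances and costs no longer appear explicitly.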
III. THE COMMUNITY-LEVEL FUNCTION F PREDICTS THE OUTCOME OF COMMUNITY COALESCENCE


Consider now a coalescence event whereby the equilibrium communities from two islands,
$C^*_\alpha$ and $C^*_\beta$, are brought into contact; as established above, the resulting
community $C^*_{\alpha\beta}$ will not depend on the details of the mixing protocol. If
none of the species from island $\beta$ could invade the community $C^*_\alpha$, then
$C^*_{\alpha\beta} = C^*_\alpha$ and the community $C^*_\alpha$ is the clear winner. In
general, however, the space of competition outcomes is richer than merely one community
taking over: both competitors $C^*_\alpha$, $C^*_\beta$ can contribute to
$C^*_{\alpha\beta}$, but can be more or less successful at doing so, contributing more or
fewer species. What makes a community more likely to be successful?

The community on each island $\alpha$ constructs its own environment $\{H_i^{\alpha}\}$.
When species from island $\alpha$ are introduced onto island $\beta$, they are exposed to a
random new environment, and the equilibrium environment $\{H_i^*\}$ that the coalescence
survivors will create for themselves will be different still. Although the success of a
species is environment-dependent, for a random environment, $f_{\vec\sigma}$ as defined
above remains the best available performance predictor. One may therefore expect that the
more successful community should be the one with more high-performance species. On the
other hand, we also found that the ultimate equilibrium community that cannot be invaded by
any species does not consist of species with the highest intrinsic performance, but
corresponds to the global maximum of $F$. This suggests that the community-level function
$F$ should be the better predictor of the competition outcome. If so, it could be said to
characterize the “collective fitness” of a community (in the restricted, purely
competitive, rather than reproductive, sense).
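A coalescence event can be mimicked on a toy system by equilibrating two communities separately, pooling their species, and equilibrating again. The sketch below is illustrative only: the two-substrate system, its costs, the multiplicative Euler integrator with τ0 = 1, and the extinction threshold are all assumptions made for this example. One parent holds a single generalist; the other holds the two specialists.

```python
import numpy as np

SIGMA = np.array([[0, 1], [1, 0], [1, 1]], dtype=float)  # rows: two specialists, generalist
CHI = np.array([1.0, 1.0, 1.8])                          # toy costs
R = np.array([2.0, 1.0])                                 # toy supply

def equilibrate(n0, dt=0.05, steps=8000):
    """Run chi * dn/dt = n * Delta to (near) equilibrium; absent species stay absent."""
    n = np.array(n0, dtype=float)
    for _ in range(steps):
        delta = SIGMA @ (R / (n @ SIGMA)) - CHI
        n = n * np.exp(dt * delta / CHI)
    return n

def objective(n, chi0=1.0):
    """Community-level function F evaluated at abundances n."""
    T = n @ SIGMA
    return (np.sum(R * np.log(T / (R / chi0))) - CHI @ n + R.sum()) / R.sum()

parent_1 = equilibrate([0.0, 0.0, 1.0])   # generalist-only island
parent_2 = equilibrate([1.0, 1.0, 0.0])   # specialists-only island
merged = equilibrate(parent_1 + parent_2) # coalescence: pool and re-equilibrate

extinct = merged < 1e-9                   # threshold is an arbitrary choice
from_parent_2 = parent_2 > 0              # provenance of each species
```

In this example the merged community retains the generalist and one specialist, the eliminated species originates from the second parent, and the value of `objective` for the merged community is higher than for either parent.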

To settle the competition between these two hypotheses, the following procedure was
implemented. For $N = 10$, $\epsilon = 10^{-3}$, and a given random realization of the cost
structure $\xi$, $M = 50$ random species were selected to allow for an exhaustive sampling
of sub-communities (the results reported below do not significantly depend on this choice).
This set was used to construct all $\binom{50}{4} = 230300$ possible combinations of
$k = 4$ species that were independently equilibrated; instances where the equilibrium state
had fewer than $k = 4$ species or where some pathways were not represented were excluded.
The putative collective fitness $F$ of the remaining 70160 communities, and the mean
individual performance of their members, are shown in Fig. 4A. This procedure puts at our
disposal multiple examples of communities where the two performance measures are both high,
both low, or one is high while the other is low (the quadrants highlighted in Fig. 4A).
Competing pairs of communities drawn from these pools will make it possible to determine
which of the two factors, individual performance of a species $f_{\vec\sigma}$ or the
collective fitness $F$ of its native community, can better predict its post-coalescence
survival.
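The sampling step of this procedure can be sketched as follows. This is a hedged illustration, not the original scripts: `candidate_communities` and `keep_community` are hypothetical helper names, the extinction tolerance is an arbitrary choice, and the equilibration itself is left out.

```python
import itertools
import math
import numpy as np

def candidate_communities(pool_size, k):
    """Iterate over every k-species subset of a pool, as tuples of indices."""
    return itertools.combinations(range(pool_size), k)

def keep_community(n_eq, sigma_subset, k=4, tol=1e-9):
    """Exclusion rule described in the text: keep a community only if all k
    species survive at equilibrium and every pathway is represented."""
    survivors = np.asarray(n_eq) > tol
    all_survive = int(survivors.sum()) == k
    all_pathways = bool(np.all(sigma_subset[survivors].sum(axis=0) > 0))
    return all_survive and all_pathways

n_candidates = math.comb(50, 4)   # number of 4-species subsets of 50 species
```

The count `n_candidates` reproduces the 230300 combinations quoted above; looping over `candidate_communities(50, 4)` and equilibrating each subset would then feed `keep_community`.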


To begin, consider the competition between the cyan and magenta quadrants (I and III,
respectively). Communities from the magenta quadrant are predicted to be more fit, both in
the collective sense and as measured by the average intrinsic performance of members.
Therefore, one expects that the magenta (III) communities should, on average, be more
successful in pairwise competitions. To confirm this, Fig. 4B presents the results of an
“elimination assay” competing communities from these quadrants. 500 random pairs were
drawn, and correspond to columns in Fig. 4B. For each pair, species from both communities
(up to 8 each time) were equilibrated together; the rows in Fig. 4B correspond to these
species, ordered by individual performance rank: high (top) to low (bottom). For each
species that went extinct during equilibration, its provenance was identified (“did it come
from the magenta or the cyan community?”), and the corresponding rectangle in Fig. 4B was
colored accordingly; in the rare cases when the eliminated species was originally present
in both communities, it was colored yellow. The dominant color in Fig. 4B is cyan,
confirming that the cyan communities are typically less successful at contributing their
members to the final equilibrium. Note also that the colored entries are predominantly
located in the bottom half of the table: the eliminated species tend to also have lower
intrinsic performance than their more successful competitors. This is the expected result.

Now, consider the competition between blue and red quadrants (II and IV). An elimination
assay conducted in an identical manner is presented in Fig. 4C. Now the colored entries are
predominantly red and occupy the top half of the table. In other words, members of the red
communities are being outcompeted despite the fact that their intrinsic performance is
higher: the individual performance of a species is less predictive of its ability to
survive coalescence than the collective fitness of the community of which it was part.

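Finally, 5000 random community pairs from the pool of Fig. 4A (not restricted to any
quadrant) were competed. Define community similarity for $C_1 = \{n_{1\vec\sigma}\}$ and
$C_2 = \{n_{2\vec\sigma}\}$ as the normalized scalar product of their species abundance
vectors:

$$S(C_1, C_2) = \frac{\sum_{\vec\sigma} n_{1\vec\sigma}\,n_{2\vec\sigma}}{\sqrt{\sum_{\vec\sigma} n_{1\vec\sigma}^2 \;\sum_{\vec\sigma} n_{2\vec\sigma}^2}}.$$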

For each of the 5000 coalescence instances $C_1 + C_2 \to C^*$, Fig. 4D plots the
similarity $S_1 = S(C_1, C^*)$ as a function of fitness difference between “parent”
communities $C_1$ and $C_2$. It comes as no surprise (cf. Fig. 2) that the predictive power
of the mean individual performance is extremely weak (black line). In contrast, community
fitness is a strong predictor: the larger the difference in community fitness, the stronger
the similarity between the post-coalescence community and its more fit parent (red line).
In the mathematical framework developed here, the observation that coalescing communities
appear to be “interacting as coherent wholes” acquires a precise formulation. Without
implying the emergence of any new level of selection, and without invoking any cooperative
traits, we observe that community coalescence can be usefully described as an interaction
between two entities, characterized macroscopically at the whole-community level.

FIG. 4. Community fitness is more predictive of competition outcome than the intrinsic
performance of its members. A: Community fitness $F$ vs. mean intrinsic performance
$\langle f_{\vec\sigma} \rangle$ of its members, measured in units of $\epsilon$, for 70160
communities composed of 4 species (see text). Communities in which both characteristics are
in the top or bottom 10% are highlighted. B: Elimination assay competing quadrants I (cyan)
vs III (magenta). 500 randomly drawn community pairs (columns) were jointly equilibrated,
with up to 8 species each time (rows; ordered by $f_{\vec\sigma}$). For each species that
went extinct during equilibration, the corresponding cell in the table is colored by the
species’ provenance. As expected, most eliminated species were from the less fit cyan
communities (there are more cyan cells than magenta). These species also had lower
$f_{\vec\sigma}$ (most colored cells are in the lower half of the table). C: Same,
competing quadrants II (blue) vs IV (red). The dominant color is now red: most eliminated
species were from red communities, and went extinct despite having higher $f_{\vec\sigma}$
(most colored cells are in the upper half of the table). Columns ordered by dominant color.
D: Community similarity $S(C_1, C^*)$ for a coalescence event depicted in the cartoon
(inset), computed for 5000 random community pairs, as a function of fitness difference
between competing communities. Fitness difference scaled to the maximum of 1 so both
fitness measures can be shown in same axes. Shown is binned mean (8 bins) over communities
with similar fitness difference (solid line) ±1 standard deviation (shaded).


IV. THE “COMMUNITY AS AN INDIVIDUAL” METAPHOR BECOMES EXACT

Consider now an external observer who is denied direct microscopic access to community
composition, and is able to perform only “metagenomic” (or, rather, “metaproteomic”)
experiments, measuring the community-wide pathway expression $T = \{T_i\}$ in response to
substrate influx $R = \{R_i\}$.

First, consider an island $\alpha_G$ harboring a single species: the complete generalist
$\vec\sigma_G = \{1, 1 \ldots 1\}$. Its abundance at equilibrium will be
$n_G = T_i = R_{\rm tot}/\chi_G$. Although substrates may be supplied in varying abundance,
the island $\alpha_G$ can only express all pathways at the same level.

Another island $\alpha_S$ might harbor a community of perfect specialists:
$\underline{A} = \{1, 0, 0 \ldots\}$, $\underline{B} = \{0, 1, 0 \ldots\}$, etc. Faced with
an uneven supply of substrates, this island will adjust expression levels $T_i$ to
precisely track the supply vector $R_i$, so that $T_i = R_i/\chi_i$, where $\chi_i$ is the
cost of the respective specialist. For an external observer whose toolkit is limited to
investigating the mapping $R \mapsto T$, the specialists’ island $\alpha_S$ is formally
indistinguishable from an organism who can sense its environment and up-regulate or
down-regulate individual pathways.

Such perfect regulation is, however, costly: typically, $\underline{A}$, $\underline{B}$,
etc. will not be the most cost-efficient combinations. As a result, allowing the community
to evolve while holding $R$ fixed, one will obtain a different multi-organism community
$C$. Unlike $\alpha_S$, it will generally be unable to respond to all environmental
perturbations: for example, the 9-species equilibrium community of Fig. 2A will necessarily
be insensitive to some direction in the 10-dimensional space of substrate concentrations.
Our external observer will conclude that evolution in a stable environment has traded some
of the sensing capacity for the ability to fit a particular substrate influx with more
efficient pathway combinations.

The model presented here can therefore be reinterpreted as a model for adaptive evolution
of a single organism striving to better adjust its response $T$ to the environment $R$ it
experiences. The model specifies how the genotype (patterns of pathway co-regulation)
determines phenotype (the mapping $R \mapsto T$), and the competitive fitness $F$ as an
explicit function of both the genotype and the environment [31]. To conclude this section,
let us compute the community fitness $F$ of the single-species generalist community
$\alpha_G$ for the case $R_i = R$. Applying the definition (5), and using
$T_i = n_G = NR/\chi_G$, one finds:

$$F = \frac{1}{R_{\rm tot}}\left[\sum_i R \ln\frac{NR/\chi_G}{R/\chi_0} - \chi_G n_G\right] + 1 = \ln\frac{N\chi_0}{\chi_G} = \ln(1 + f_G) \approx f_G,$$

where $f_G$ is the individual performance (4) of organism $\vec\sigma_G$, and the
approximate equality holds because $f_G$ is of order $\epsilon$, assumed small. In other
words, for a single-species community, the community fitness coincides with the individual
performance of that species, reinforcing the emergent parallel between a community and an
individual that had evolved an internal division of labor. This interpretation is specific
to the particular model explored here, but within this model, the metaphor is
mathematically exact.
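The generalist result can be checked with a few lines of arithmetic. The sketch below plugs illustrative numbers (N = 10 substrates, equal supply, and an arbitrarily chosen cost deviation ξ = 0.5 at ε = 10⁻³) into the logarithmic form of F and compares the outcome with ln(1 + f) and with f itself; all numerical values are assumptions made for the example.

```python
import numpy as np

N, R, chi0 = 10, 1.0, 1.0            # substrates, supply per substrate, cost unit
eps, xi = 1e-3, 0.5                  # cost scatter and one arbitrary cost draw

chi_G = chi0 * N * (1.0 + eps * xi)  # cost of the complete generalist
n_G = N * R / chi_G                  # its equilibrium abundance (zero surplus)

R_vec = np.full(N, R)
T = np.full(N, n_G)                  # every pathway expressed at level n_G
R_tot = R_vec.sum()

# Community-level function evaluated for the single-species community.
F = (np.sum(R_vec * np.log(T / (R_vec / chi0))) - chi_G * n_G + R_tot) / R_tot

f_G = chi0 * N / chi_G - 1.0         # individual performance of the generalist
```

Here `F` agrees with `np.log1p(f_G)` to rounding error, and differs from `f_G` only at second order in ε.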


V. COMMUNITY COHESION AS A GENERIC CONSEQUENCE OF ECOLOGICAL INTERACTIONS

It is important to contrast the results of the previous section with the notion of “fitness
decoupling” in multi-level selection theory (MLS). In MLS, a higher level of organization
is recognized when a group of cooperating organisms acquires interests that are distinct
from the self-interest of its members [18]. Here, competition always remains entirely
“selfish”. In each instance of community competition assayed in Fig. 4, whenever some
species invaded a community, it was because its fitness in that particular environment was
higher than the fitness of species already present. In contrast to fitness decoupling,
which requires special circumstances to evolve, the community-level cohesion described in
this work is a generic consequence of the fact that organisms modify their environment, and
that fitness is context-dependent.

The definition of individual performance corresponds to how we might experimentally
measure fitness, by placing an organism in a “typical” environment it is believed to
experience. In the model described here, this typical environment is often an excellent
approximation: for a community at equilibrium with equiabundant substrates $R_i = R$, the
total community-wide expression of each pathway is roughly $T_i \approx R/\chi_0$, the
same for all $i$. Nevertheless, even small deviations may be sufficient to induce
substantial reordering of the relative performance rank of different species, in which
case the context-dependent component of fitness can become dominant.

If this interpretation of the results of Fig. 4 is correct, then reducing the degree to
which environmental perturbations affect the relative fitness of individuals should lead
to a tighter link between community fitness and individual species’ performance. This
prediction can be tested by increasing $\epsilon$, the parameter that determines the width
of the distribution of organism costs. For example, consider a community where the
substrate A is disputed by only two organisms: A and AB. Assume that $f_A > f_{AB}$, so
that when substrates A and B are equally abundant, the species A displaces AB. Reducing
the availability of substrate A can reverse this outcome (if A is absent, AB can still
survive, but not A). However, the larger the difference between the intrinsic
performances $f_A$ and $f_{AB}$, the more extreme such resource depletion would have to
be. Therefore, increasing $\epsilon$ will reduce the relative effect that a changing
environment has on fitness rank ordering. Fig. 5 repeats the analysis of Fig. 4 for
$\epsilon = 0.1$ (rather than $\epsilon = 10^{-3}$ used previously). As predicted, the
collective fitness is now strongly associated with the performance of individuals. In
fact, this is already apparent in Fig. 2B: as $\epsilon$ is increased, the median fitness
rank of survivors at the final equilibrium begins to decrease. At high $\epsilon$, it is
increasingly true that high collective fitness is merely a reflection of high intrinsic
performance of community members. Thus Fig. 2B documents a transition between a largely
individualistic regime (at large $\epsilon$) and a regime where the genetically
inhomogeneous assembly of species increasingly acts “as a whole”, in the precise sense
discussed in the previous section.

FIG. 5. Parameter $\epsilon$ tunes the magnitude of community cohesion. Same as Fig. 4,
for larger $\epsilon = 0.1$. Increasing $\epsilon$ reduces the relative importance of the
environment in determining the performance ranking of species. As a result, the collective
fitness of a community and the mean individual performance of its members remain strongly
coupled. Defining quadrants as in Fig. 4 leaves the blue and red quadrants empty.


This work presented a theoretical framework where the analogy between a community
harboring organisms at varying abundances, and an organism expressing genes at different
levels, becomes an exact mathematical statement. A striking feature of this perspective is
the blurred boundary between the notions of competition and genetic recombination [Sami-.
Consider competition between organisms as an operation that takes two organisms and
yields one:

$$ \text{Competition:} \quad (O_1, O_2) \mapsto O_* . $$

Traditionally, the space of outcomes is binary: one competitor lives, one dies, and the
propensity to survive competition is called fitness. When competition between communities
of organisms is considered, this definition must inevitably be generalized to allow $O_*$
to be distinct from either of the original competitors. Such “competitors”, however, might
be more aptly named “parents”. In sexual reproduction, recombination allows a subset of
the genes inherited from both parents to form progeny with potentially higher fitness;
here, the competition between parent communities $C_\alpha$ and $C_\beta$ allows a subset
of their members to regroup into a daughter community $C_*$ with a higher collective
fitness $F$. The parallel becomes especially clear if one imagines propagules of
$C_\alpha$ and $C_\beta$ co-colonizing a fresh environmental patch.

Such member regrouping can be much more flexible than the rules of sexual recombination,
but reduces to the latter in the particular case of communities with clearly demarcated
functional guilds (e.g., consider competition between two communities that each has one
plant, one pollinator, one herbivore, one carnivore, etc.). Long before the evolution of
sex, such recombination would have allowed communities with divided labor to fix
evolutionary novelty more efficiently than a clonal population of generalists. Although
the metaphor of a genome as an “ecosystem of genes” is not new [38], the framework
presented here allows it to be formalized and investigated quantitatively.

The results in this work were derived within the simplified framework of a particular
model where microscopic dynamics conveniently took the form of optimizing a
community-level objective function. In general, of course, collective dynamics are almost
never reducible to solving an optimization problem [35]. However, conceptually, the
statement that environment-dependent species performance translates into an effective
cohesion of coalescing communities is merely a generalization of the classical result that
niche-packed communities are more resistant to invasion [40], which is recurrent across
multiple modeling frameworks. In the model at hand, the existence of a global objective
function made this phenomenon particularly easy to investigate; in a more general model,
it would not be possible to quantify this effect with a single number (the “community
fitness”). Nevertheless, the qualitative result may be expected to persist, so that
members of a co-evolved community with a history of coalescence would tend to have higher
persistence upon interaction with a “naive” community that had never been exposed to such
events, as proposed in Ref. M-. More work is required to verify the generality of this
hypothesis.

The results presented here, derived in a purely competitive model, demonstrate that
functional cohesion is conceptually separate from the discussions of “altruism” and
cooperation, except to the extent described by the formula “enemy of my enemy is my
friend” (indirect facilitation). The latter can be seen as a form of cooperation [35], but
is a generic phenomenon and is not vulnerable to “cheaters”.

While the criteria of “true multicellularity” are too stringent to apply to most natural
communities, the phenomenon described in this work is a generic consequence of ecological
interactions in a diverse ecosystem. If whole-community coalescence events are indeed a
significant factor shaping the evolutionary history of microbial consortia, then
community-level cohesion of the type described here can be expected to be broadly relevant
for natural ecosystems.

SUPPLEMENTARY MATERIAL


1. Relation to the model of MacArthur

The dynamics (3) can be written as

$$ \frac{dn_\sigma}{dt} \propto n_\sigma \left[ \sum_i \sigma_i h_i - \chi_\sigma \right], \qquad \text{(S1)} $$

where $h_i$ denotes the “available resources”. In the model considered in this work,
$h_i = R_i/T_i$. MacArthur (1969) considered a model of species competing for renewing
resources. In that model, the dynamics of organism populations were identical to (S1), but
the availability of resources was given by $h_i = R_i(1 - T_i/r_i)$ (see equations
(1)-(3) in MacArthur 1969), where the extra parameter $r_i$ is the renewal rate (or the
“intrinsic rate of natural increase”).

The dynamics of the two models, therefore, differ only by the choice of the functional
form relating population growth and the corresponding decrease of resource availability.
The mapping between the notations of MacArthur 1969 (“MA”) and those used here is provided
in the table:

    Notation for...                            MA         Here
    ---------------------------------------    -------    ------------------------
    Species index                              $i$        $\sigma$
    Species abundance                          $x_i$      $n_\sigma$
    Resources a species can harvest            $a_{ij}$   $\sigma_i$
    Resource carrying capacity                 $K_j$      $R_i$
    Minimal resource requirement               $T_i$      $\chi_\sigma$
    “Resource weight”                          $w_i$      1
    Resources → biomass conversion factor      $c_i$      $(\tau\chi_\sigma)^{-1}$
    Resource renewal rate                      $r_i$      N/A

In the work of MacArthur, each species $i$ was described by an arbitrarily chosen vector
of parameters $a_{ij}$ (probability to encounter and consume resource $j$). The space of
possibilities is unconstrained, and the types available to form a community are fixed by
historical contingency; MacArthur then asks how many species can co-exist in this way. In
the model considered here, $\sigma_i$ are constrained to be 0 or 1. The setting is treated
as an adaptive dynamics model where species are allowed to acquire or lose pathways, and
the outcome of this co-evolution is investigated.

Reformulating community dynamics as an optimization problem was first done in MacArthur
1969; here, because of the difference in the way resource consumption is treated, the
objective function being optimized is different, but the argument is similar. Consider the
following objective function:

$$ \tilde F = \sum_i R_i \ln T_i - \sum_\sigma \chi_\sigma n_\sigma , \qquad \text{(S2)} $$

defined for $\{n_\sigma > 0\}$, and differing from the definition of Eq. (5) only by
normalization.

$\tilde F$ is bounded from above. To see this, note the inequalities (here $|\sigma|$ is
the number of pathways carried by type $\sigma$):

$$ \sum_i T_i = \sum_\sigma |\sigma|\, n_\sigma \le N \sum_\sigma n_\sigma $$

and for $\alpha, \beta > 0$:

$$ \alpha \ln x - \beta x \le \alpha \ln \frac{\alpha}{e\beta} . $$

Using these, and setting $\min_\sigma \chi_\sigma \equiv \chi^* > 0$, one can write:

$$
\tilde F \le \sum_i R_i \ln T_i - \chi^* \sum_\sigma n_\sigma
         \le \sum_i \left[ R_i \ln T_i - \frac{\chi^*}{N} T_i \right]
         \le \sum_i R_i \ln \frac{N R_i}{e \chi^*} .
$$

$\tilde F$ is convex (in the sense of being convex upward, i.e. concave, so that any local
maximum is global). To see this, note that for any function $f(n)$, the following two
operations leave its convexity invariant ($M$ is an arbitrary matrix):

1. adding a linear function of its arguments:

   $f(n) \mapsto g(n) = f(n) + Mn$;

2. performing a linear transformation of its arguments:

   $f(n) \mapsto h(n) = f(Mn)$.

Given these observations, convexity of $\tilde F$, and therefore also the convexity of $F$
as defined in (5), directly follows from the convexity of the logarithm.

The main text demonstrated that $\tilde F$ is always increasing along the trajectories of
the model. Thus, for any initial community state $C$, ecological dynamics converge to the
equilibrium corresponding to the unique maximum of $\tilde F$ on the domain
$\{n_\sigma \ge 0 \text{ for } \sigma \in \Omega(C)\}$. Since $\tilde F$ is bounded and
convex, the final equilibrium always exists and is unique and stable.
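
The role of the objective function as a Lyapunov function can be illustrated on a toy community. The sketch below is not the authors' code: it assumes the growth law $dn_\sigma/dt = n_\sigma \Delta_\sigma$ with $\Delta_\sigma = \sum_i \sigma_i R_i/T_i - \chi_\sigma$ (prefactor set to one), uses explicit Euler steps instead of a stiff solver, and all costs and initial abundances are hypothetical. It tracks $\sum_i R_i \ln T_i - \sum_\sigma \chi_\sigma n_\sigma$ along two trajectories started from different initial conditions.

```python
import math

# Hypothetical toy setup: N = 2 resources and three types (illustrative costs).
PATHWAYS = {"A": (1, 0), "B": (0, 1), "AB": (1, 1)}   # sigma_i for each type
COSTS = {"A": 1.00, "B": 1.02, "AB": 2.30}            # chi_sigma
R = (100.0, 100.0)                                    # resource supply R_i

def total_demand(n):
    """T_i = sum over types of sigma_i * n_sigma."""
    return [sum(PATHWAYS[s][i] * n[s] for s in n) for i in range(len(R))]

def surplus(n):
    """Resource surplus Delta_sigma = sum_i sigma_i R_i / T_i - chi_sigma."""
    T = total_demand(n)
    return {s: sum(PATHWAYS[s][i] * R[i] / T[i] for i in range(len(R))) - COSTS[s]
            for s in n}

def objective(n):
    """Unnormalized objective: sum_i R_i ln T_i - sum_sigma chi_sigma n_sigma."""
    T = total_demand(n)
    return (sum(r * math.log(t) for r, t in zip(R, T))
            - sum(COSTS[s] * n[s] for s in n))

def simulate(n0, dt=0.01, steps=10_000):
    """Explicit Euler integration of dn/dt = n * Delta; returns final state and history."""
    n = dict(n0)
    history = [objective(n)]
    for _ in range(steps):
        d = surplus(n)
        n = {s: n[s] * (1.0 + dt * d[s]) for s in n}
        history.append(objective(n))
    return n, history

final_1, hist_1 = simulate({"A": 10.0, "B": 10.0, "AB": 10.0})
final_2, hist_2 = simulate({"A": 1.0, "B": 50.0, "AB": 5.0})
print(final_1)
print(final_2)
```

With these illustrative costs the generalist is more expensive than the two specialists combined, so both runs should end near $n_A = R/\chi_A$, $n_B = R/\chi_B$ with AB vanishing, and the recorded objective should not decrease along either trajectory.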


2. Normalization of community fitness

The typical value of $\tilde F$ as defined in equation (S2) for a community close to
equilibrium can be estimated as follows.

To estimate the first term, note that the cost per pathway of all organisms is close to
$\chi_0$, and therefore the overall expression $T_i$ is approximately
$T_i \approx R_i/\chi_0$.

The second term is the total cost of all organisms in the population,
$\sum_\sigma n_\sigma \chi_\sigma$. At any equilibrium, it is equal to the total resource
abundance $R_{\rm tot} \equiv \sum_i R_i$. This can be seen in two ways. One approach is
to use the equilibrium conditions to express the cost of all present organisms in terms of
resources:

$$ \forall \sigma \in \Omega(C): \quad \chi_\sigma = \sum_i \sigma_i \frac{R_i}{T_i} . $$

Therefore,

$$
\sum_\sigma n_\sigma \chi_\sigma
  = \sum_i \Big( \sum_\sigma n_\sigma \sigma_i \Big) \frac{R_i}{T_i}
  = \sum_i R_i .
$$

Alternatively, this same equation can be derived from the condition of maximization of
$\tilde F$, by setting $n_\sigma = M p_\sigma$ and requiring
$\partial \tilde F / \partial M = 0$.

Putting these observations together, the expectation for the value of $\tilde F$ at any
equilibrium is therefore

$$
\tilde F = \sum_i R_i \ln T_i - \sum_\sigma \chi_\sigma n_\sigma
         = \sum_i R_i \ln T_i - \sum_i R_i
         \approx \sum_i R_i \ln (R_i/\chi_0) - \sum_i R_i . \qquad \text{(S3)}
$$

When defining community fitness, it is natural to subtract this baseline value from
$\tilde F$ as defined in (S2), and normalize by $R_{\rm tot}$:

$$
F = \frac{\tilde F - \left[ \sum_i R_i \ln (R_i/\chi_0) - \sum_i R_i \right]}{\sum_i R_i} .
$$

This is the normalization chosen in equation (5) in the main text.


3. Sensitivity to the value of $\epsilon$

Fig. 2B demonstrates that for small enough $\epsilon$, the structure of the final
equilibria does not significantly depend on this parameter. This can be intuitively
understood as follows. Consider two resources A, B and organisms A = {1, 0}, B = {0, 1},
and AB = {1, 1}. If

$$ \chi_{AB} > \chi_A + \chi_B , \qquad \text{(S4)} $$

it easily follows that the “generalist” organism AB will eventually be outcompeted by the
two specialists A and B. Conversely, if the opposite inequality holds, then A and B cannot
stably coexist in the final equilibrium, since AB will always be able to invade,
displacing one (or both) of them. In this way, in the metagenome partitioning model,
community composition is shaped primarily by inequalities like (S4), which are invariant
under changes in $\epsilon$ and depend only on the realization of the “noise” $\xi$.


4. The maximum number of coexisting types

The traditional question of how many types can coexist for a given set of parameters,
although not the focus of this work, is nevertheless instructive to address. A simple
linear algebra argument demonstrates that in the model considered here, this maximum
number is $N$: a stable coexistence is possible only for a number of types that is at most
equal to the number of resources. This is because for a given set of $K$ types, the $K$
equilibrium conditions $\Delta_\sigma = 0$ can be seen as a linear mapping between the
$N$-dimensional vector $R_i/T_i$ and a $K$-dimensional vector of organism costs
$\chi_\sigma$. In the generic case (i.e. if no special symmetries exist in the cost
structure), the existence of such a mapping requires $K \le N$.

Symmetries in the cost structure can lead to degenerate equilibria circumventing this
maximal coexistence condition. Imagine, for example, that all organisms have the exact
same cost per pathway $\chi_0$. In this maximally degenerate case any combination of
functional types can coexist, provided that $T_i = R_i/\chi_0$: no division of labor
strategy is better than any other.


5. Numerical determination of community equilibrium

To determine the equilibrium state established through competition of a given set of $K$
species, one could imagine choosing a random starting point with a non-vanishing abundance
of all $K$ competing species, and evolving it according to the dynamical equations for
time $t \to \infty$. The Lyapunov function guarantees that such evolution would converge
to an equilibrium state. However, if $K \gg N$ (for example, $K = 1023$ and $N = 10$ in
Fig. 2A), such a procedure is highly memory-intensive and wasteful, since the final
population is guaranteed to contain at most $N$ types with non-zero abundance (see section
“The maximum number of coexisting types”).
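
The bound of at most N coexisting types invoked above can be made concrete for N = 2 resources and the three types A = {1, 0}, B = {0, 1}, AB = {1, 1}. The sketch below is only an illustration with hypothetical cost values; the function name `generalist_surplus` is not from the paper. It solves the two specialist equilibrium conditions for the resource availabilities $R_i/T_i$ and evaluates the resource surplus left for the generalist, which vanishes only for fine-tuned costs.

```python
def generalist_surplus(chi_A, chi_B, chi_AB):
    """Resource surplus of AB at the equilibrium of the two specialists.

    The specialist conditions fix the availabilities: R_A/T_A = chi_A and
    R_B/T_B = chi_B. The third condition, for AB, then has no free unknown left.
    """
    h_A = chi_A   # availability of resource A set by specialist A
    h_B = chi_B   # availability of resource B set by specialist B
    return h_A + h_B - chi_AB

def outcome(chi_A, chi_B, chi_AB):
    """Classify the three-type competition by the sign of the generalist's surplus."""
    delta = generalist_surplus(chi_A, chi_B, chi_AB)
    if delta < 0:
        return "AB cannot invade the specialists"
    if delta > 0:
        return "AB invades; A and B cannot stably coexist"
    return "degenerate: all three conditions hold simultaneously"

print(outcome(1.0, 1.02, 2.3))
print(outcome(1.0, 1.02, 2.0))
print(outcome(1.0, 1.0, 2.0))
```

Only the last, symmetric choice of costs lets all K = 3 conditions hold with N = 2 unknowns, which is the degenerate situation described in the section on the maximum number of coexisting types.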

Conveniently, verifying that a configuration is a final equilibrium is much easier than
finding it: one only needs to check that the resource surplus $\Delta_\sigma$ is zero for
all competitors that survived and is negative for all those who went extinct. This
verification is fast and is guaranteed to either confirm that the equilibrium state is
correct, or provide a list of species that can invade it. Therefore, a simple heuristic
procedure can construct the true equilibrium configuration through an iterated sequence of
“guesses”, whereby a subset of species is first equilibrated, and then updated by removing
species that went extinct and adding those that can invade. This is the approach adopted
here.
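
The verification step described above can be sketched as follows. This is a minimal illustration, not the authors' Matlab code: the function names, the data layout (a dictionary mapping each type to its pathway vector and cost) and the tolerance are assumptions made for the example.

```python
import math

def resource_surplus(pathways, cost, availability):
    """Delta_sigma = sum_i sigma_i * (R_i / T_i) - chi_sigma."""
    return sum(h for present, h in zip(pathways, availability) if present) - cost

def check_equilibrium(types, abundances, R, tol=1e-8):
    """Return (unbalanced survivors, possible invaders) for a candidate state.

    types:      dict name -> (pathway tuple of 0/1, cost)
    abundances: dict name -> abundance, listing only the types that are present
    """
    n_res = len(R)
    T = [sum(types[s][0][i] * n for s, n in abundances.items()) for i in range(n_res)]
    # An unconsumed resource is treated as infinitely available.
    availability = [R[i] / T[i] if T[i] > 0 else math.inf for i in range(n_res)]
    unbalanced, invaders = [], []
    for name, (pathways, cost) in types.items():
        delta = resource_surplus(pathways, cost, availability)
        if name in abundances:
            if abs(delta) > tol:
                unbalanced.append(name)   # survivor whose surplus is not zero
        elif delta > tol:
            invaders.append(name)         # absent type that would grow
    return unbalanced, invaders

# Hypothetical example: two specialists at equilibrium, generalist AB absent.
R = (100.0, 100.0)
state = {"A": 100.0 / 1.00, "B": 100.0 / 1.02}
cheap = {"A": ((1, 0), 1.00), "B": ((0, 1), 1.02), "AB": ((1, 1), 2.00)}
costly = {"A": ((1, 0), 1.00), "B": ((0, 1), 1.02), "AB": ((1, 1), 2.30)}
print(check_equilibrium(cheap, state, R))    # AB is listed as an invader
print(check_equilibrium(costly, state, R))   # nothing can invade
```

An empty pair of lists confirms the candidate state; a non-empty invader list is exactly the information needed to form the next “guess”.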

Specifically, calculations were performed in Matlab (Mathworks, Inc.). Availability of all
resources was set to $R = 100$. The “initial guess” $S_0$ is constructed using the
individual fitness criterion explained in the main text (low cost per pathway = high
fitness): for each pathway $i$, the 10 most cost-efficient (lowest cost per pathway)
functional types ($S_0^{(i)}$) that contained pathway $i$ are determined; the union of
these cost-efficient types, all taken at equal abundance of 1 unit, constitutes the
“initial guess” $S_0 = \bigcup_i S_0^{(i)}$.

The following procedure is then iterated: community dynamics are simulated using Matlab’s
variable-order differential equation solver ode15s until the absolute magnitude of all
time derivatives falls below the threshold $10^{-4}\epsilon$. At this point, most of the
very-low-abundance species still present in the community are in the process of
exponential extinction. To ensure that all such low-abundance types are indeed going
extinct, all types with abundance below $10^{-4}$ are removed from the population, the
pruned community is re-equilibrated (to account for any tiny adjustments this removal
might have caused), and the resulting state $C^*$ is tested for being a non-invadable
equilibrium. If any invaders are found, they are added to the community at abundance 1,
and the simulation cycle is repeated. Otherwise (no species can invade), the configuration
is accepted as being within the pre-determined numerical error of the true final
equilibrium. This protocol ensures that in the community $C^*$, the list of survivors is
exact (because the invadability criterion is always checked for all competing species and
is exact), and their abundance is within acceptable numerical error. The protocol always
converged due to convexity of “community fitness” $F$.
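
The iterated protocol can be sketched in a few lines. The version below is an illustration under stated assumptions rather than a port of the original scripts: it replaces Matlab's ode15s with plain explicit Euler steps, uses a fixed hypothetical derivative threshold in place of $10^{-4}\epsilon$, assumes the growth law $dn_\sigma/dt = n_\sigma\Delta_\sigma$, and runs on a three-type toy community with made-up costs. All function names are hypothetical.

```python
def demand(types, n, n_res):
    """T_i = sum over present types of sigma_i * n_sigma."""
    return [sum(types[s][0][i] * n[s] for s in n) for i in range(n_res)]

def surplus(types, name, R, T):
    """Delta_sigma; an unconsumed resource counts as infinitely available."""
    pathways, cost = types[name]
    total = 0.0
    for i, present in enumerate(pathways):
        if present:
            if T[i] == 0.0:
                return float("inf")
            total += R[i] / T[i]
    return total - cost

def equilibrate(types, n, R, dt=0.01, rate_tol=1e-6, max_steps=500_000):
    """Euler-integrate dn/dt = n * Delta until all time derivatives are small."""
    n = dict(n)
    for _ in range(max_steps):
        T = demand(types, n, len(R))
        rates = {s: n[s] * surplus(types, s, R, T) for s in n}
        if not rates or max(abs(r) for r in rates.values()) < rate_tol:
            break
        n = {s: n[s] + dt * rates[s] for s in n}
    return n

def find_equilibrium(types, R, initial, prune_below=1e-4, max_rounds=50):
    """Equilibrate, prune rare types, re-equilibrate, then add invaders; repeat."""
    n = dict(initial)
    for _ in range(max_rounds):
        n = equilibrate(types, n, R)
        n = {s: x for s, x in n.items() if x >= prune_below}
        n = equilibrate(types, n, R)
        T = demand(types, n, len(R))
        invaders = [s for s in types if s not in n and surplus(types, s, R, T) > 1e-9]
        if not invaders:
            return n                      # non-invadable: accept the configuration
        for s in invaders:
            n[s] = 1.0                    # invaders are added at abundance 1
    raise RuntimeError("no non-invadable state found within max_rounds")

# Hypothetical toy community; the initial guess contains only the generalist.
TYPES = {"A": ((1, 0), 1.00), "B": ((0, 1), 1.02), "AB": ((1, 1), 2.30)}
R = (100.0, 100.0)
result = find_equilibrium(TYPES, R, {"AB": 1.0})
print(result)
```

In this toy run the first guess is invadable by both specialists, and the second cycle drives the generalist below the pruning threshold. At the accepted state the total cost $\sum_\sigma \chi_\sigma n_\sigma$ should match $R_{\rm tot} = 200$, in line with the normalization argument of the supplement.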

Scripts performing calculations and reproducing 
Figs. 2-5 are available upon request. 


6. Supplementary information for Figure 2B

Figure 2B was generated as follows. For a given cost structure, 10 random subsets $\Omega$
of 100 types each were equilibrated to determine survivors $S^*$. The procedure was
repeated for 10 random realizations of the cost structure at each $\epsilon$, with
$\epsilon$ ranging from $10^{-5}$ to 0.1. Thus for each value of $\epsilon$, a total of
100 randomly constructed communities were evaluated. Fig. 2B shows the median performance
rank of survivors $S^*$ within the respective set of competitors, averaged over all 100
instances, where the median was either weighted (blue dashed line) or not weighted (red
solid line) by abundance of the type at equilibrium.
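
The rank statistic described above can be computed as in the following sketch. It is an illustration only: ranking by cost per pathway follows the criterion stated in the text, but the tie-breaking convention (the lower median is returned), the function names and the example numbers are assumptions.

```python
def performance_ranks(cost_per_pathway):
    """Rank 1 = lowest cost per pathway, i.e. best individual performance."""
    order = sorted(cost_per_pathway, key=cost_per_pathway.get)
    return {name: rank for rank, name in enumerate(order, start=1)}

def median_rank(ranks, survivor_abundance, weighted=False):
    """Median rank of survivors, optionally weighted by equilibrium abundance.

    Returns the lower (weighted) median: the smallest rank at which the
    cumulative weight reaches half of the total weight.
    """
    items = sorted((ranks[s], n if weighted else 1.0)
                   for s, n in survivor_abundance.items())
    half = sum(w for _, w in items) / 2.0
    running = 0.0
    for rank, w in items:
        running += w
        if running >= half:
            return rank

# Hypothetical competitor set and survivors (abundances at equilibrium).
costs = {"a": 1.00, "b": 1.01, "c": 1.02, "d": 1.03, "e": 1.04}
survivors = {"a": 90.0, "d": 5.0, "e": 5.0}
ranks = performance_ranks(costs)
print(median_rank(ranks, survivors))                  # unweighted median rank
print(median_rank(ranks, survivors, weighted=True))   # abundance-weighted median rank
```

In this example the two statistics differ because a single high-performing survivor dominates the abundance, which is the distinction drawn between the two curves of Fig. 2B.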


7. Initial conditions for Fig. 3

The trajectories displayed in Fig. 3 were simulated for time $T = 10^6$ starting from 10
random initial conditions whereby each of the 1023 types was set to an abundance value
drawn out of a log-uniform distribution between $10^{-5}$ and 100.
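
Drawing such initial conditions amounts to sampling the logarithm of the abundance uniformly. The sketch below shows one way to do this with the Python standard library; the function names and the seeds are hypothetical and unrelated to the original scripts.

```python
import math
import random

def log_uniform(rng, low, high):
    """Sample x such that ln(x) is uniform on [ln(low), ln(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

def random_initial_condition(n_types=1023, low=1e-5, high=100.0, seed=0):
    """One initial condition: an independent log-uniform abundance for each type."""
    rng = random.Random(seed)
    return [log_uniform(rng, low, high) for _ in range(n_types)]

# Ten initial conditions, one per (arbitrary) seed.
initial_conditions = [random_initial_condition(seed=k) for k in range(10)]
print(len(initial_conditions), len(initial_conditions[0]))
```

Sampling in log space spreads the starting abundances evenly over the seven decades between the two bounds, so rare and abundant starting types are equally represented.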



